Occupancy Assessment for Bats of the Pacific Northwest

Preliminary Results

Authors
Affiliations

Trent VanHawkins

On Belay Statistical Consulting

Beth Ward

Oregon State University-Cascades

Tom Rodhouse, PhD

National Park Service

Code
library(tidyverse)
library(here)
library(latex2exp)
library(sf)
library(rstan)
library(gt)
library(kableExtra)
# Read in Data
## Full 
occ_map <- readRDS(here("DataProcessed.nosync/results/stan/full/occ_map.rds"))

alpha_summ <- readRDS(here("DataProcessed.nosync/results/stan/full/alpha_summ.rds"))

beta_summ <- readRDS(here("DataProcessed.nosync/results/stan/full/beta_summ.rds"))

## Sensitivity
occ_map_sens <- readRDS(here("DataProcessed.nosync/results/stan/ORWA_Only/occ_map.rds"))

alpha_summ_sens <- readRDS(here("DataProcessed.nosync/results/stan/ORWA_Only/alpha_summ.rds"))

beta_summ_sens <- readRDS(here("DataProcessed.nosync/results/stan/ORWA_Only/beta_summ.rds"))

## define some species we will exclude
exclude <- c("tabr", "pahe", "euma")

Introduction

This purpose of this report is to document progress toward an occupancy trend analysis for bat species of the Pacific Northwest. Previously, the Northwestern Bat Hub for Population Monitoring and Research (NW Bat Hub) and its partners have documented evidence of a region-wide decline for Hoary Bats between 2016 and 2018 (T. J. Rodhouse et al. 2019). Since publication of this research, the bat hub has continued population research and monitoring under North American Bat Program (NABat) summer-monitoring protocols and has since grown to include surveys in all of Oregon, Washington, and Idaho. Here, we present the preliminary results of an updated analysis including data from 2016-2022 for all of the NABat sample units encompassed by Oregon, Washington, and Idaho.

Methods

Methods for the current study are similar to those published in Rodhouse et. al, 2019. We will provide brief detail on important components of the data and modeling procedures but further details can be found in the literature or in the unpublished report from Wright, Wilson (2020).

The Data

We queried data from NW Bat Hub database files for all included up to the time of this report (2016-2022). The NABat summer sampling design and its explanation are well-documented in the literature (Reichert et al. 2021; T. Rodhouse and Irvine 2017; T. J. Rodhouse et al. 2019; Udell et al., n.d.).

Briefly, NABat uses a spatial grid covering the continental United States in 10x10 km sample units—hereto referred to as “grid cells” or simply “cells”— with an ordered sampling priority in order to ensure a spatially balanced sample that meets statistical modeling assumptions. The current monitoring protocol utilized by the NW Bat Hub calls for one night of acoustic monitoring at each of four spatial replicates within a cell during the summer season. Ideally, each of the spatial replicates are distributed to one of the inter-cardinal 5x5 km quadrants (NE, NW, SE, SW) of a cell, but when this is not possible multiple detectors may be deployed in a single quadrant. The current modeling strategy is also flexible to >4 spatial replicates, but is not required. In instances where detectors were deployed for >1 night, only the first night was subsetted for analysis.

The Canyon Bat (Parastrellis hesperus), the Mexican Free-tailed Bat (Tadarida brasiliensis), and the Spotted Bat (Euderma maculatum) have been removed from this analysis due to small sample size or limited range.

The Model

The Bayesian framework used for this analysis is thoroughly described in T. J. Rodhouse et al. (2019), but we will briefly define the model and its parameters here. The framework is broken down into two sub-models, an occupancy sub-model and a detection sub-model.

First, we have a true underlying occupancy status \(Z_{it}\) at site \(i\) and year \(t\). This random variable can take on the value of 1 (occupied) or 0 (unoccupied). The true occupancy status is unknowable (since we cannot truly know that a site is unoccupied via acoustic monitoring), but we can say that it arises from a probability distribution with some probability that the site is occupied (\(\psi_{it}\)).

\[Z_{it} \sim \text{Bernoulli}(\psi_{it}),\]

Furthermore, we assume that the parameter \(\psi_{it}\) can be partially explained by a set of environmental covariates, \(\bf{x}\) (Table 2), for which we will estimate an effect \(\alpha\). We will also allow the probability of occupancy to vary between years and to be influenced by the status of the previous year by including the autoregressive parameter \(\alpha_{auto}\).
\[\text{logit}(\psi_{i1}) = \alpha_{01} + {\bf x}_i\alpha,\] \[\text{logit}(\psi_{it}) = \alpha_{02} + {\bf x}_i\alpha + Z_{i(t-1)}\alpha_{auto}\text{ for }t \in \{2,\ldots,T\},\]

Finally, we have a detection sub-model which estimates the conditional probability that we detected a bat species (\(p_{ijt}\)) at site \(i\) and replicate \(j\) on year \(t\) given that site \(i\) was occupied at year \(t\). We assume that \(p_{ijt}\) arises from a similar probability distribution and may also be explained by some environmental covariates \(\bf{v}\), for which we wish to estimate effects (\(\bf{\beta}\)).
\[[Y_{ijt} \mid Z_{it}=1] \sim \text{Bernoulli}(p_{ijt})\] \[\text{logit}(p_{ijt}) = \beta_0 + {\bf v}_{ijt}\beta,\]

A full summary of these parameters is provided in Table 1 and environmental covariates in Table 2. For the purpose of this report, we are most interested in reviewing the posterior probability of occupancy (\(\psi_{it}\)), effect estimates for occupancy environmental covariates (\(\bf{\alpha}\)), and effect estimates for detection environmental covariates \(\bf{\beta}\). All models were fit in the STAN programming language and model parameters extracted using rstan ver. 2.32.6.

Table 1: Model parameters and descriptions.
Parameter Description
\(\psi_{it}\) Posterior occupancy probability for site \(i\) at time \(t\)
\(\alpha_{01}\) Occupancy model intercept at year \(t = 1\)
\(\alpha_{02}\) Occupancy model intercept at year \(t = 2, \ldots, T\)
\(\bf{\alpha}\) Occupancy model effect size for environmental covariates
\(\bf{x}\) Occupancy model environmental covariates Table 2 (a).
\(\alpha_{auto}\) Autoregressive parameter allowing for different occupancy probabilities on year \(2,\ldots, T\)
\(p_{ijt}\) Conditional probability of species detection at site \(i\) and replicate \(j\) on year \(t\) given site \(i\) is occupied at year \(t\)
\(\beta_0\) Detection model intercept
\(\bf{v}\) Detection model environmental covariates Table 2 (b)
\(\bf{\beta}\) Detection model effect sizes for environmental covariates
Table 2: Covariates included in (A) the occupancy model, or (B) the detection model. All covariates were first standardized to mean = 0 and sd = 1 to promote computational efficiency with the exception of Clutter (centered only) and categorical Waterbody. Forest Cover and Cliff Cover were nautral-log transformed before standardization due to zero-inflation.
(a) Environmental covariates included in the occupancy sub-model.
Name Description
Forest Cover (%) Percentage of the sample unit that classified as forest cover (NLCD)
Precipitation (mm) Mean annual precipitation (WorldClim)
Cliff Cover (%) Percentage of the sample unit classified as cliff cover (LandFire)
(b) Environmental covariates included in the detection sub-model.
Name Description
Minimum Temperature Minimum temperature (\(C^{\circ}\)) on the night of detector deployment
Day Length Day length (previous day) on the night of deployment
Clutter (%) Categorical clutter (0 = 0%; 1 = 1-25%; 2 = 26-50%; 3 = 51-75%; 4 = 76-100%)
Waterbody Presence of a waterbody at deployment site (1 = yes; 0 = no)

Sensitivity Analysis

Previous studies released by the NW Bat Hub and its partners have focused on acoustic records captured in the 2016 to 2019 field seasons and, therefore, exclude Idaho. In order to compare model results from this analysis with those more comparable to previous research, we have conducted a sensitivity analysis using data from only Oregon and Washington. Methods for this sensitivity analysis are the same as for the primary analysis and will be presented alongside results from the full analysis.

Results

Code
n_sites <- as.numeric(length(unique(occ_map$anpa$means_map_long$cell[occ_map$anpa$means_map_long$sampled == 1])))

A total of 355 independent cells were included across the seven year period from 2016 to 2022 (Table 3; Figure 1).

Code
occ_map$anpa$means_map_long %>% 
  st_drop_geometry() %>% 
  filter(sampled == 1) %>% 
  group_by(years) %>% 
  reframe("Sample Units" = sum(sampled)) %>% 
  rename("Year" = years) %>% 
  gt() %>% 
  tab_options(table.width = pct(50)) %>% 
  cols_align("left")
Table 3: Sample units surveyed by year
Year Sample Units
2016 82
2017 60
2018 145
2019 174
2020 242
2021 267
2022 294

Partners at in Idaho joined the NABat effort coordinated by the NW Bat Hub during the 2020 season, and sampling efforts are reflected in Figure 1.

Code
occ_map$anpa$means_map_long %>%
  mutate(sampled = factor(sampled, labels = c("No", "Yes"))) %>% 
  ggplot()+
  geom_sf(aes(fill = factor(sampled)))+
  facet_wrap(~years)+
  scale_fill_discrete(type = c("transparent", "blue"))+
  labs(fill = "Sampled")+
  theme_bw()+
  theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.x = element_blank(),
          axis.ticks.y = element_blank(),
        legend.position = "bottom")
Figure 1: Map of NABat grid cells (10x10 km) sampled by year.

Modeling Results

Covariates

Figure 2 gives summaries of the posterior distributions for each environmental covariate from the occupancy sub-model, for each species included in the analysis. Please note that these results are preliminary and are subject to change.

Code
#Alphas
alpha_summ <- alpha_summ %>% mutate(analysis = "OR|WA|ID")
alpha_summ_sens <- alpha_summ_sens %>% mutate(analysis = "OR|WA Only")

alpha_full <- bind_rows(alpha_summ, alpha_summ_sens) 


  
alpha_full %>% 
  filter(!(alpha %in% c("int01", "int02", "alpha_auto")),
         !(spp %in% exclude)) %>% 
  mutate(spp = toupper(spp),
         alpha = factor(alpha, labels = c("Forest Cover",
                                          "Precipitation",
                                          "Cliff Cover"))) %>%
ggplot(aes(x = spp, y = mean, group = analysis))+
  geom_errorbar(aes(ymin = q2.5, ymax = q97.5), width = 0, size = 0.75,
                position = position_dodge(width = 0.5)) +
  geom_errorbar(aes(ymin = q25, ymax = q75, color = analysis), width = 0,
                size = 1.5, position = position_dodge(width = 0.5)) +
  geom_point(position = position_dodge(width = 0.5)) +
  theme_bw() +
  facet_grid(alpha ~ ., scales = 'free') +
  geom_hline(yintercept = 0, lty = 2) +
  xlab('Species') + 
  ylab('Posterior Distribution (Log-Odds)') +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
        axis.text.y = element_text(size = 10)) +
  ggtitle('Comparing occupancy coefficients (Mean & 95% CI)',
          'Single-season, multi-species model')+
  labs(color = "Legend") 
Figure 2: Summary of posterior distributions for occupancy model covariates, by species. Point locations represent the posterior means, colored regions represent the 50% credible interval, and black lines represent the 95% credible interval. Outputs provided on the log-odds scale.

Figure 2 shows stable estimates with the exception of some large values from the posterior distribution of the coefficient for precipitation for models of the california myotis (MYCA) and the little brown bat (MYLU). Estimates are reported on the log-odds scale and can be generally be interpreted as “a change of 1 standard deviation in X is associated with an increase of Y in the log-odds of occupancy”.

Figure 3 gives summaries of the posterior distributions for each environmental covariate from the detection sub-model, for each species included in the analysis. Again, please note that these results are preliminary and are subject to change.

Code
#Betas
beta_summ <- beta_summ %>% mutate(analysis = "OR|WA|ID")
beta_summ_sens <- beta_summ_sens %>% mutate(analysis = "OR|WA Only")

beta_full <- bind_rows(beta_summ, beta_summ_sens) 

beta_full %>% 
  filter(!(betas %in% c("Intercept")),
         !(spp %in% exclude)) %>% 
  mutate(spp = toupper(spp),
         betas = factor(betas, labels = c("Min. Temp.", "Day Length", "Clutter", "Water"))) %>% 
ggplot(aes(x = spp, y = mean, group = analysis))+
  geom_errorbar(aes(ymin = q2.5, ymax = q97.5), width = 0, size = 0.75,
                position = position_dodge(width = 0.5)) +
  geom_errorbar(aes(ymin = q25, ymax = q75, color = analysis), width = 0,
                size = 1.5, position = position_dodge(width = 0.5)) +
  geom_point(position = position_dodge(width = 0.5)) +
  theme_bw() +
  facet_grid(betas ~ ., scales = 'free') +
  geom_hline(yintercept = 0, lty = 2) +
  xlab('Species') + 
  ylab('Posterior Distribution (Log-Odds)') +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
        axis.text.y = element_text(size = 10)) +
  ggtitle('Comparing detection coefficients (Mean & 95% CI)',
          'Single-season, multi-species model')+
  labs(color = "Legend") 
Figure 3: Summary of posterior distributions for detection model covariates, by species. Point locations represent the posterior means, colored regions represent the 50% credible interval, and black lines represent the 95% credible interval. Outputs provided on the log-odds scale.

Figure 3 again shows stable estimates for model coefficients across species. These results may also be interpreted as “a change of 1 standard deviation in X is associated with an increase of Y in the log-odds of detection, given that the species is present.”

Posterior Predictions

Figure 4 gives a summary of the posterior distribution of estimated occupancy probabilities across all predicted grid cells in the three-state region of Oregon, Washington, and Idaho. Again, these are preliminary results and are subject to change.

Code
psi_summ <- function(occ_map){
  for (i in 1:length(occ_map)) {
  psi_temp <- occ_map[[i]]$means_map_long%>% 
    as.data.frame() %>% 
    group_by(years) %>% 
    summarise(mean_psi = mean(mean),
              q2.5 = quantile(mean, 0.025),
              q25 = quantile(mean, 0.25),
              q75 = quantile(mean, 0.75),
              q97.5 = quantile(mean, 0.975),
              spp = names(occ_map)[[i]]) %>% 
    ungroup()
  
  if (i == 1) {
    all_psi <- psi_temp
  }
  else{all_psi <- bind_rows(all_psi, psi_temp)}
  }
  
  return(all_psi)
}

psi_full <- psi_summ(occ_map = occ_map) %>% mutate(analysis = "OR|WA|ID")
psi_sens <- psi_summ(occ_map = occ_map_sens) %>% mutate(analysis = "OR|WA Only")

all_psi <- bind_rows(psi_full, psi_sens)

all_psi %>% 
  filter(!(spp %in% exclude)) %>% 
  mutate(spp = toupper(spp)) %>% 
ggplot(aes(x = years, y = mean_psi, group = analysis))+
  geom_errorbar(aes(ymin = q2.5, ymax = q97.5), width = 0, size = 0.75,
                position = position_dodge(width = 0.5)) +
  geom_errorbar(aes(ymin = q25, ymax = q75, color = analysis), width = 0,
                size = 1.5, position = position_dodge(width = 0.5)) +
  geom_point(position = position_dodge(width = 0.5)) +
  theme_bw() +
  ylim(c(0,1))+
  facet_grid(spp ~ .) +
  xlab('Species') + 
  ylab(TeX("$\\hat{psi}$")) +
  ggtitle('Comparing detection coefficients (Mean & 95% CI)',
          'Single-season, multi-species model')+
  labs(color = "Legend") 
Figure 4: Region-wide summary of the posterior distribution of \(\hat{psi_t}\) along with 50% credible interval (colored region) and 95% credible interval (black line).

Occupancy maps

Maps of the posterior predictions for all species displayed in Figure 4 are included with this report and visualize the mean and 95% credibile intervals of the posterior distribution of \(\hat{\psi_t}\). Figure 5 gives an example of the map for mean posterior occupancy probabilities and Figure 6 displays uncertainty around those means. Interestingly, uncertainty is consistently high in the Hell’s Canyon wilderness region— a rugged and remote region that has not been yet been captured by monitoring efforts. An overlay of sampling effort with uncertainty is shown in Figure 7.

Code
occ_map$myci$means_map_long %>% 
    ggplot()+
    geom_sf(aes(fill = mean), size = 0.1, lwd = 0.05) +
    facet_wrap(. ~ years, ncol = 4) +
    viridis::scale_fill_viridis(limits = c(0, 1)) +
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.x = element_blank(),
          axis.ticks.y = element_blank(),
          legend.position = "bottom")+
    ggtitle('MYCI Posterior Mean Occupancy',
            'Multi-season, single-species model')+
  labs(fill = "Posterior Mean Occ.")
Figure 5: Mean posterior occupancy probabilities visualized for the small-footed bat. Cool colors indicate low probability of occurrence on year t and warm colors indicate a high probability.
Code
occ_map$myci$ci_map_long %>% 
    ggplot()+
    geom_sf(aes(fill = width), size = 0.1, lwd = 0.05) +
    facet_wrap(. ~ years, ncol = 4) +
    viridis::scale_fill_viridis(limits = c(0, 1)) +
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.x = element_blank(),
          axis.ticks.y = element_blank(),
          legend.position = "bottom")+
    ggtitle('MYCI Posterior 95% Credible Interval (Width)',
            'Multi-season, single-species model')+
  labs(fill = "95% CI")
Figure 6: 95% credible interval of posterior occupancy probabilities visualized for the small-footed bat. Warm colors indicate higher uncertainty.
Code
occ_map$myci$ci_map_long %>% 
  mutate(sampled = factor(sampled, labels = c("No", "Yes"))) %>% 
    ggplot()+
    geom_sf(aes(fill = width, color = sampled), size = 0.05, lwd = 1)+ 
    facet_wrap(. ~ years, ncol = 4) +
    viridis::scale_fill_viridis(limits = c(0, 1)) +
  scale_color_discrete(type = c("transparent", "red"))+
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.x = element_blank(),
          axis.ticks.y = element_blank(),
          legend.position = "bottom")+
    ggtitle('MYCI Posterior 95% Credible Interval (Width)',
            'Multi-season, single-species model')+
  labs(fill = "95% CI",
       color = "Sampled")
Figure 7: Model uncertainty (95% Credible Interval) overlayed with sampling efforts

Discussion

This report represents a significant update to monitoring of regional trends for bat species of the pacific northwest. Further model validation strategies are required in order to make sound inference about occupancy trends for these species. Next steps are to validate posterior estimates against models fit in JAGS, another Bayesian modeling framework, and to calculate a rate of decline for each species (\(\lambda\)). Parameter estimates from this report indicate that posterior estimates may benefit from additional transformations of model covariates.

Acknowledgements

The Northwest Bat Hub thanks its contributing partners for their continued support to better understand regional status and trends of bats in the Pacific Northwest.

References

Reichert, Brian E., Mylea Bayless, Tina L. Cheng, Jeremy T. H. Coleman, Charles M. Francis, Winifred F. Frick, Benjamin S. Gotthold, et al. 2021. “NABat: A Top-down, Bottom-up Solution to Collaborative Continental-Scale Monitoring.” Ambio 50 (4): 901–13. https://doi.org/10.1007/s13280-020-01411-y.
Rodhouse, Thomas J., Rogelio M. Rodriguez, Katharine M. Banner, Patricia C. Ormsbee, Jenny Barnett, and Kathryn M. Irvine. 2019. “Evidence of Region-Wide Bat Population Decline from Long-Term Monitoring and Bayesian Occupancy Models with Empirically Informed Priors.” Ecology and Evolution 9 (19): 11078–88. https://doi.org/10.1002/ece3.5612.
Rodhouse, Thomas, and Kathryn Irvine. 2017. The Master Sample Concept as a Catalyst for Collaborative Continental-Scale Research and Monitoring.
Udell, Bradley J, Bethany R Straw, Tina Cheng, Kyle D Enns, Winfred Frick, Benjamin Gotthold, Kathryn M Irvine, et al. n.d. “Status and Trends of North American Bats Summer Occupancy Analysis 2010-2019 Data Release.” https://doi.org/10.5066/P92JGACB.
Wright, Wilson. 2020. “Code Tutorial for NABat Occupancy Analyses.”